Josiah Yoder, Ph.D.
Associate Professor
- Milwaukee WI UNITED STATES
- Diercks Hall DH424
- Electrical Engineering and Computer Science
Dr. Josiah Yoder's specialties include computer vision, artificial intelligence, and deep learning.
Education, Licensure and Certification
Ph.D.
Computer Engineering
Purdue University
2011
B.S.
Computer Engineering
Rose-Hulman Institute of Technology
2005
Accomplishments
Outstanding Service Award
2011
Purdue University
Affiliations
- Association for Computing Machinery (ACM): Member
Selected Publications
Prime Holdout Problems
Preprint arXiv:2205.12932 [math.NT]
Milkert, M., Ruchti, A., Yoder, J.
2022
This paper introduces prime holdout problems, a problem class related to the Collatz conjecture. After applying a linear function, instead of removing a finite set of prime factors, a holdout problem specifies a set of primes to be retained. A proof that all positive integers converge to 1 is given for both a finite and an infinite holdout problem. It is conjectured that finite holdout problems cannot diverge for any starting value, which has implications for divergent sequences in the Collatz conjecture.
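The abstract above can be illustrated with a minimal sketch of one plausible reading of a holdout iteration. This is our own illustrative interpretation, not the paper's exact map: apply the linear function 3n + 1, then retain only the prime factors in the holdout set, dividing out all others.

```python
def retain_primes(n, holdout):
    """Keep only the prime factors of n that lie in `holdout`
    (with multiplicity); every other prime factor is divided out."""
    result, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            if p in holdout:
                result *= p
        p += 1
    if n > 1 and n in holdout:  # any leftover factor is prime
        result *= n
    return result

def holdout_step(n, holdout):
    """One step of the (assumed) holdout iteration: apply the
    linear map 3n + 1, then retain only the held-out primes."""
    return retain_primes(3 * n + 1, holdout)

# With holdout = {2}, starting at 7: 7 -> 22 -> 2, then 2 -> 7 -> 1.
```

Under this reading, a holdout problem asks whether every starting value eventually reaches 1, mirroring the Collatz conjecture's convergence question.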
Prostate Cancer Histology Synthesis Using StyleGAN Latent Space Annotation
Accepted for publication in the Proceedings of Medical Image Computing and Computer Assisted Intervention (MICCAI) 2022
Daroach, G.B., Duenweg, S.R., Brehler, M., Lowman, A.K., Iczkowski, K.A., Jacobsohn, K.M., Yoder, J.A., and LaViolette, P.S.
2022
The latent space of a generative adversarial network (GAN) may model pathologically significant semantics with unsupervised learning. To explore this phenomenon, we trained and tested a StyleGAN2 on a high-quality prostate histology dataset covering the prostate cancer (PCa) diagnostic spectrum. Our pathologist annotated synthetic images to identify learned PCa regions in the GAN latent space. New points were drawn from these regions, synthesized into images, and given to a pathologist for annotation. 77% of the new points received the same annotation, and 98% of the latent points received the same or an adjacent diagnostic-stage annotation. This confirms that the GAN can accurately disentangle and model PCa features without exposure to labels during training.
Exploring the Exponentially Decaying Merit of an Out-of-Sequence Observation
Sensors
Yoder, J., Baek, S., Kwon, H., Pack, D.
2018
It is well known that in a Kalman filtering framework, all sensor observations or measurements contribute toward improving the accuracy of state estimation, but, as observations become older, their impact on the estimate shrinks to the point that they offer no practical benefit. In this paper, we provide a practical technique for determining the merit of an old observation using system parameters. We demonstrate that the benefit provided by an old observation decreases exponentially with the number of observations captured and processed after it. To quantify the merit of an old observation, we use the filter gain for the delayed observation, found by re-processing all past measurements between the delayed observation and the current time estimate, a high-cost task. We demonstrate the value of the proposed technique to system designers using both nearly-constant position (random walk) and nearly-constant velocity (discrete white-noise acceleration, DWNA) cases. In these cases, the merit (that is, the gain) of an old observation can be computed in closed form without iteration. The analysis technique incorporates the state transition function, the observation function, the state transition noise, and the observation noise to quantify the merit of an old observation. Numerical simulations demonstrate the accuracy of these predictions even when measurements arrive randomly according to a Poisson distribution. Simulations confirm that our approach correctly predicts which observations increase estimation accuracy based on their delay by comparing a single-step out-of-sequence Kalman filter with a selective version that drops out-of-sequence observations. This approach may be used in system design to evaluate the feasibility of a multi-agent target tracking system and when selecting system parameters such as sensor rates and network latencies.
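The exponential decay described above can be seen in the simplest case the abstract mentions, a scalar random-walk (nearly-constant position) model. A minimal sketch, with process-noise and measurement-noise variances q and r as our own assumed parameters: in steady state the Kalman estimate is an exponentially weighted average of past observations, so an observation d steps old carries weight k(1 - k)^d, where k is the steady-state gain.

```python
def steady_state_gain(q, r, iters=500):
    """Iterate the scalar Riccati recursion to convergence for the
    random-walk model x[k+1] = x[k] + w, z[k] = x[k] + v.

    q: process-noise variance, r: observation-noise variance.
    """
    p = 1.0
    for _ in range(iters):
        p_pred = p + q                 # predict: the walk adds q each step
        k = p_pred / (p_pred + r)      # Kalman gain
        p = (1.0 - k) * p_pred         # update
    return k

k = steady_state_gain(q=0.1, r=1.0)

# Weight of an observation d steps old in the current estimate:
# it shrinks geometrically with each newer observation processed.
weights = [k * (1.0 - k) ** d for d in range(6)]
```

Each additional observation processed after the old one multiplies its weight by (1 - k), which is the exponential decay of merit the paper quantifies.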
Determining Optimum Drop-out Rate for Neural Networks
The Bridge, The Magazine of IEEE-Eta Kappa Nu (IEEE-HKN)
Yoder, J.
2018
Dropout is used to reduce overfitting in neural networks. Past research determines the optimum dropout rate for a given dataset but does not compare optimal dropout rates across datasets. The purpose of this project is to investigate whether optimum dropout rates correlate across datasets that are non-spatial, non-time-series, and have heterogeneous inputs. One dataset with these properties is credit card default data, which contains each client's age, education, etc., and whether they defaulted on their credit card. A dropout rate of 0.5 is widely used but does not always optimize performance. For each dataset, deep neural network models were trained over various dropout rates and training-set sizes. The experimental results presented here show that the optimum dropout rate can fall anywhere within its possible range from 0 to 1, that even 10% dropout can significantly improve performance over no dropout, and that dropout can be effective even on small datasets.
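For readers unfamiliar with the mechanism being swept here, a minimal NumPy sketch of inverted dropout (our own illustration, not the paper's code): each unit is zeroed with probability `rate` during training, and the survivors are rescaled by 1/(1 - rate) so the expected activation is unchanged at test time.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, rate, training=True):
    """Inverted dropout: zero each unit with probability `rate` and
    rescale survivors by 1/(1 - rate), so no rescaling is needed at
    test time (training=False returns x unchanged)."""
    if not training or rate == 0.0:
        return x
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

# A sweep like the one in the paper would train a model at each
# candidate rate and compare validation accuracy; here we only show
# that the layer preserves the mean activation in expectation.
x = np.ones(100_000)
means = {rate: dropout(x, rate).mean() for rate in (0.1, 0.5, 0.9)}
```

The rescaling is why the sweep can vary `rate` freely without changing the scale of activations the rest of the network sees.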
Using objective ground-truth labels created by multiple annotators for improved video classification: A comparative study
Computer Vision and Image Understanding
Srivastava, G., Yoder, J.A., Park, J., Kak, A.C.
2013
We address the problem of predicting category labels for unlabeled videos in a large video dataset by using a ground-truth set of objectively labeled videos that we have created. Large video databases like YouTube require that a user uploading a new video assign it a category label from a prescribed set of labels. Such category labeling is likely to be corrupted by the subjective biases of the uploader. Despite their noisy nature, these subjective labels are frequently used as the gold standard in algorithms for multimedia classification and retrieval. Our goal in this paper is not to propose yet another algorithm that predicts labels for unseen videos based on the subjective ground truth. Rather, our goal is to demonstrate that video classification performance can be improved if, instead of using subjective labels, we first create an objectively labeled ground-truth set of videos and then train a classifier on that ground truth to predict objective labels for the unlabeled videos.
Cluster-based distributed face tracking in camera networks
IEEE Transactions on Image Processing
Yoder, J., Medeiros, H., Park, J., Kak, A.C.
2010
In this paper, we present a distributed multicamera face tracking system suitable for large wired camera networks. Unlike previous multicamera face tracking systems, our system does not require a central server to coordinate the entire tracking effort. Instead, an efficient camera clustering protocol is used to dynamically form groups of cameras for in-network tracking of individual faces. The clustering protocol includes cluster propagation mechanisms that allow the computational load of face tracking to be transferred to different cameras as the target objects move. Furthermore, the dynamic election of cluster leaders provides robustness against system failures. Our experimental results show that our cluster-based distributed face tracker is capable of accurately tracking multiple faces in real time. The overall performance of the distributed system is comparable to that of a centralized face tracker, while presenting the advantages of scalability and robustness.